Lambek Calculus
   HOME

TheInfoList



OR:

Categorial grammar is a family of formalisms in natural language syntax that share the central assumption that syntactic constituents combine as
function Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-oriente ...
s and
arguments An argument is a statement or group of statements called premises intended to determine the degree of truth or acceptability of another statement called conclusion. Arguments can be studied from three main perspectives: the logical, the dialectic ...
. Categorial grammar posits a close relationship between the syntax and semantic composition, since it typically treats syntactic categories as corresponding to semantic types. Categorial grammars were developed in the 1930s by
Kazimierz Ajdukiewicz Kazimierz Ajdukiewicz (12 December 1890 – 12 April 1963) was a Polish philosopher and logician, a prominent figure in the Lwów–Warsaw school of logic. He originated many novel ideas in semantics. Among these was categorial grammar, a highly ...
and in the 1950s by
Yehoshua Bar-Hillel Yehoshua Bar-Hillel ( he, יהושע בר-הלל; 8 September 1915, in Vienna – 25 September 1975, in Jerusalem) was an Israeli philosopher, mathematician, and linguist. He was a pioneer in the fields of machine translation and formal linguis ...
and
Joachim Lambek Joachim "Jim" Lambek (5 December 1922 – 23 June 2014) was a German-born Canadian mathematician. He was Peter Redpath Emeritus Professor of Pure Mathematics at McGill University, where he earned his PhD degree in 1950 with Hans Zassenhaus a ...
. It saw a surge of interest in the 1970s following the work of
Richard Montague Richard Merritt Montague (September 20, 1930 – March 7, 1971) was an American mathematician and philosopher who made contributions to mathematical logic and the philosophy of language. He is known for proposing Montague grammar to formalize ...
, whose Montague grammar assumed a similar view of syntax. It continues to be a major paradigm, particularly within formal semantics.


Basics

A categorial grammar consists of two parts: a lexicon, which assigns a set of types (also called categories) to each basic symbol, and some
type inference Type inference refers to the automatic detection of the type of an expression in a formal language. These include programming languages and mathematical type systems, but also natural languages in some branches of computer science and linguistic ...
rules, which determine how the type of a string of symbols follows from the types of the constituent symbols. It has the advantage that the type inference rules can be fixed once and for all, so that the specification of a particular language grammar is entirely determined by the lexicon. A categorial grammar shares some features with the
simply typed lambda calculus The simply typed lambda calculus (\lambda^\to), a form of type theory, is a typed interpretation of the lambda calculus with only one type constructor (\to) that builds function types. It is the canonical and simplest example of a typed lambda c ...
. Whereas the lambda calculus has only one function type A \rightarrow B, a categorial grammar typically has two function types, one type that is applied on the left, and one on the right. For example, a simple categorial grammar might have two function types B/A\,\! and A\backslash B. The first, B/A\,\!, is the type of a phrase that results in a phrase of type B\,\! when followed (on the right) by a phrase of type A\,\!. The second, A\backslash B\,\!, is the type of a phrase that results in a phrase of type B\,\! when preceded (on the left) by a phrase of type A\,\!. The notation is based upon algebra. A fraction when multiplied by (i.e.
concatenated In formal language theory and computer programming, string concatenation is the operation of joining character strings end-to-end. For example, the concatenation of "snow" and "ball" is "snowball". In certain formalisations of concatenat ...
with) its denominator yields its numerator. As concatenation is not
commutative In mathematics, a binary operation is commutative if changing the order of the operands does not change the result. It is a fundamental property of many binary operations, and many mathematical proofs depend on it. Most familiar as the name of ...
, it makes a difference whether the denominator occurs to the left or right. The concatenation must be on the same side as the denominator for it to cancel out. The first and simplest kind of categorial grammar is called a basic categorial grammar, or sometimes an AB-grammar (after
Ajdukiewicz Kazimierz Ajdukiewicz (12 December 1890 – 12 April 1963) was a Polish philosopher and logician, a prominent figure in the Lwów–Warsaw school of logic. He originated many novel ideas in semantics. Among these was categorial grammar, a highl ...
and Bar-Hillel). Given a set of primitive types \text\,\!, let \text(\text)\,\! be the set of types constructed from primitive types. In the basic case, this is the least set such that \text\subseteq \text(\text) and if X, Y\in \text(\text) then (X/Y), (Y\backslash X) \in \text(\text). Think of these as purely formal expressions freely generated from the primitive types; any semantics will be added later. Some authors assume a fixed infinite set of primitive types used by all grammars, but by making the primitive types part of the grammar, the whole construction is kept finite. A basic categorial grammar is a tuple (\Sigma, \text, S, \triangleleft) where \Sigma\,\! is a finite set of symbols, \text\,\! is a finite set of primitive types, and S \in \text(\text). The relation \triangleleft is the lexicon, which relates types to symbols (\triangleleft) \subseteq \text(\text) \times \Sigma. Since the lexicon is finite, it can be specified by listing a set of pairs like TYPE\triangleleft\text. Such a grammar for English might have three basic types (N,NP, \text S)\,\!, assigning
count noun In linguistics, a count noun (also countable noun) is a noun that can be modified by a quantity and that occurs in both singular and plural forms, and that can co-occur with quantificational determiners like ''every'', ''each'', ''several'', ...
s the type N\,\!, complete noun phrases the type NP\,\!, and sentences the type S\,\!. Then an
adjective In linguistics, an adjective (abbreviated ) is a word that generally modifies a noun or noun phrase or describes its referent. Its semantic role is to change information given by the noun. Traditionally, adjectives were considered one of the ma ...
could have the type N/N\,\!, because if it is followed by a noun then the whole phrase is a noun. Similarly, a determiner has the type NP/N\,\!, because it forms a complete noun phrase when followed by a noun. Intransitive
verb A verb () is a word ( part of speech) that in syntax generally conveys an action (''bring'', ''read'', ''walk'', ''run'', ''learn''), an occurrence (''happen'', ''become''), or a state of being (''be'', ''exist'', ''stand''). In the usual descr ...
s have the type NP\backslash S, and transitive verbs the type (NP\backslash S)/NP. Then a string of words is a sentence if it has overall type S\,\!. For example, take the string "the bad boy made that mess". Now "the" and "that" are determiners, "boy" and "mess" are nouns, "bad" is an adjective, and "made" is a transitive verb, so the lexicon is . and the sequence of types in the string is now find functions and appropriate arguments and reduce them according to the two
inference rule In the philosophy of logic, a rule of inference, inference rule or transformation rule is a logical form consisting of a function which takes premises, analyzes their syntax, and returns a conclusion (or conclusions). For example, the rule of ...
s X\leftarrow X/Y,\; Y and X\leftarrow Y,\; Y\backslash X: .\qquad NP/N,\; N/N,\; N,\; (NP\backslash S)/NP,\; \underbrace
.\qquad NP/N,\; N/N,\; N,\; \underbrace
.\qquad NP/N,\; \underbrace, \qquad (NP\backslash S)
.\qquad \underbrace,\; \qquad (NP\backslash S)
.\qquad \qquad\underbrace
.\qquad \qquad\qquad\quad\;\;\; S The fact that the result is S\,\! means that the string is a sentence, while the sequence of reductions shows that it can be parsed as ((the (bad boy)) (made (that mess))). Categorial grammars of this form (having only function application rules) are equivalent in generative capacity to context-free grammars and are thus often considered inadequate for theories of natural language syntax. Unlike CFGs, categorial grammars are
lexicalized In linguistics, lexicalization is the process of adding words, set phrases, or word patterns to a language's lexicon. Whether '' word formation'' and ''lexicalization'' refer to the same process is controversial within the field of linguistics. M ...
, meaning that only a small number of (mostly language-independent) rules are employed, and all other syntactic phenomena derive from the lexical entries of specific words. Another appealing aspect of categorial grammars is that it is often easy to assign them a compositional semantics, by first assigning interpretation types to all the basic categories, and then associating all the
derived categories In mathematics, the derived category ''D''(''A'') of an abelian category ''A'' is a construction of homological algebra introduced to refine and in a certain sense to simplify the theory of derived functors defined on ''A''. The construction proce ...
with appropriate
function Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-oriente ...
types. The interpretation of any constituent is then simply the value of a function at an argument. With some modifications to handle
intensionality In any of several fields of study that treat the use of signs — for example, in linguistics, logic, mathematics, semantics, semiotics, and philosophy of language — an intension is any Property (philosophy), property or Quality (philosophy), ...
and quantification, this approach can be used to cover a wide variety of semantic phenomena.


Lambek calculus

A Lambek grammar is an elaboration of this idea that has a concatenation operator for types, and several other inference rules. Mati Pentus has shown that these still have the generative capacity of context-free grammars. For the Lambek calculus, there is a type concatenation operator \star, so that \text\subseteq \text(\text) and if X, Y\in \text(\text) then (X/Y), (X\backslash Y), (X\star Y)\in \text(\text). The Lambek calculus consists of several deduction rules, which specify how type inclusion assertions can be derived. In the following rules, upper case roman letters stand for types, upper case Greek letters stand for sequences of types. A sequent of the form X \leftarrow \Gamma can be read: a string is of type if it consists of the concatenation of strings of each of the types in . If a type is interpreted as a set of strings, then the ← may be interpreted as ⊇, that is, "includes as a subset". A horizontal line means that the inclusion above the line implies the one below the line. The process is begun by the Axiom rule, which has no antecedents and just says that any type includes itself. : \text\quad The Cut rule says that inclusions can be composed. : \text \quad The other rules come in pairs, one pair for each type construction operator, each pair consisting of one rule for the operator in the target, one in the source, of the arrow. The name of a rule consists of the operator and an arrow, with the operator on the side of the arrow on which it occurs in the conclusion. : For an example, here is a derivation of "type raising", which says that (B/A)\backslash B \leftarrow A. The names of rules and the substitutions used are to the right. : \dfrac \qquad \begin \mbox\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\\ \\ \qquad\qquad\qquad\\ \end


Relation to context-free grammars

Recall that a context-free grammar is a 4-tuple G = (V,\, \Sigma,\, ::=,\, S) where # V\, is a finite set of ''non-terminals'' or ''variables''. # \Sigma\, is a finite set of ''terminal symbols''. # ::=\, is a finite set of production rules, that is, a finite relation (::=)\subseteq V \times (V \cup \Sigma)^*. # S\, is the start variable. From the point of view of categorial grammars, a context-free grammar can be seen as a calculus with a set of special purpose axioms for each language, but with no type construction operators and no inference rules except Cut. Specifically, given a context-free grammar as above, define a categorial grammar (\text,\, \Sigma,\, \triangleleft,\, S) where \text=V\cup\Sigma, and \text(\text)=\text\,\!. Let there be an axiom for every symbol x \in V\cup\Sigma, an axiom for every production rule X ::= \Gamma\,\!, a lexicon entry for every terminal symbol s \in \Sigma, and Cut for the only rule. This categorial grammar generates the same language as the given CFG. Of course, this is not a basic categorial grammar, since it has special axioms that depend upon the language; i.e. it is not lexicalized. Also, it makes no use at all of non-primitive types. To show that any context-free language can be generated by a basic categorial grammar, recall that any context-free language can be generated by a context-free grammar in
Greibach normal form In formal language theory, a context-free grammar is in Greibach normal form (GNF) if the right-hand sides of all production rules start with a terminal symbol, optionally followed by some variables. A non-strict form allows one exception to this f ...
. The grammar is in Greibach normal form if every production rule is of the form A ::= s A_0 \ldots A_, where capital letters are variables, s \in \Sigma, and N\ge 0, that is, the right side of the production is a single terminal symbol followed by zero or more (non-terminal) variables. Now given a CFG in Greibach normal form, define a basic categorial grammar with a primitive type for each non-terminal variable \text=V\,\!, and with an entry in the lexicon A/A_/ \ldots /A_0 \triangleleft s , for each production rule A ::= s A_0 \ldots A_. It is fairly easy to see that this basic categorial grammar generates the same language as the original CFG. Note that the lexicon of this grammar will generally assign multiple types to each symbol. The same construction works for Lambek grammars, since they are an extension of basic categorial grammars. It is necessary to verify that the extra inference rules do not change the generated language. This can be done and shows that every context-free language is generated by some Lambek grammar. To show the converse, that every language generated by a Lambek grammar is context-free, is much more difficult. It was an open problem for nearly thirty years, from the early 1960s until about 1991 when it was proven by Pentus. The basic idea is, given a Lambek grammar, (\text,\, \Sigma,\, \triangleleft,\, S) construct a context-free grammar (V,\, \Sigma,\, ::=,\, S) with the same set of terminal symbols, the same start symbol, with variables some (not all) types V\subseteq \text(\text)\,\!, and with a production rule T::=\text\,\! for each entry T\triangleleft\text in the lexicon, and production rules T::=\Gamma\,\! for certain sequents T\leftarrow\Gamma that are derivable in the Lambek calculus. Of course, there are infinitely many types and infinitely many derivable sequents, so in order to make a finite grammar it is necessary put a bound on the size of the types and sequents that are needed. The heart of Pentus's proof is to show that there is such a finite bound.


Notation

The notation in this field is not standardized. The notations used in formal language theory, logic, category theory, and linguistics, conflict with each other. In logic, arrows point to the more general from the more particular, that is, to the conclusion from the hypotheses. In this article, this convention is followed, i.e. the target of the arrow is the more general (inclusive) type. In logic, arrows usually point left to right. In this article this convention is reversed for consistency with the notation of context-free grammars, where the single non-terminal symbol is always on the left. We use the symbol ::= in a production rule as in Backus–Naur form. Some authors use an arrow, which unfortunately may point in either direction, depending on whether the grammar is thought of as generating or recognizing the language. Some authors on categorial grammars write B\backslash A instead of A\backslash B. The convention used here follows Lambek and algebra.


Historical notes

The basic ideas of categorial grammar date from work by
Kazimierz Ajdukiewicz Kazimierz Ajdukiewicz (12 December 1890 – 12 April 1963) was a Polish philosopher and logician, a prominent figure in the Lwów–Warsaw school of logic. He originated many novel ideas in semantics. Among these was categorial grammar, a highly ...
(in 1935) and
Yehoshua Bar-Hillel Yehoshua Bar-Hillel ( he, יהושע בר-הלל; 8 September 1915, in Vienna – 25 September 1975, in Jerusalem) was an Israeli philosopher, mathematician, and linguist. He was a pioneer in the fields of machine translation and formal linguis ...
(in 1953). In 1958,
Joachim Lambek Joachim "Jim" Lambek (5 December 1922 – 23 June 2014) was a German-born Canadian mathematician. He was Peter Redpath Emeritus Professor of Pure Mathematics at McGill University, where he earned his PhD degree in 1950 with Hans Zassenhaus a ...
introduced a syntactic calculus that formalized the function
type constructors In the area of mathematical logic and computer science known as type theory, a type constructor is a feature of a typed formal language that builds new types from old ones. Basic types are considered to be built using nullary type constructors. S ...
along with various rules for the combination of functions. This calculus is a forerunner of linear logic in that it is a
substructural logic In logic, a substructural logic is a logic lacking one of the usual structural rules (e.g. of classical and intuitionistic logic), such as weakening, contraction, exchange or associativity. Two of the more significant substructural logics are r ...
. Montague grammar uses an ad hoc syntactic system for English that is based on the principles of categorial grammar. Although Montague's work is sometimes regarded as syntactically uninteresting, it helped to bolster interest in categorial grammar by associating it with a highly successful formal treatment of natural language
semantics Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference, meaning, or truth. The term can be used to refer to subfields of several distinct disciplines, including philosophy, linguistics and comp ...
. More recent work in categorial grammar has focused on the improvement of syntactic coverage. One formalism that has received considerable attention in recent years is Steedman and Szabolcsi's
combinatory categorial grammar Combinatory categorial grammar (CCG) is an efficiently parsable, yet linguistically expressive grammar formalism. It has a transparent interface between surface syntax and underlying semantic representation, including predicate–argument structur ...
, which builds on combinatory logic invented by
Moses Schönfinkel Moses Ilyich Schönfinkel (russian: Моисей Исаевич Шейнфинкель, translit=Moisei Isai'evich Sheinfinkel; 29 September 1888 – 1942) was a logician and mathematician, known for the invention of combinatory logic. Life Mose ...
and
Haskell Curry Haskell Brooks Curry (; September 12, 1900 – September 1, 1982) was an American mathematician and logician. Curry is best known for his work in combinatory logic. While the initial concept of combinatory logic was based on a single paper by ...
. There are a number of related formalisms of this kind in linguistics, such as
type logical grammar Type may refer to: Science and technology Computing * Typing, producing text via a keyboard, typewriter, etc. * Data type, collection of values used for computations. * File type * TYPE (DOS command), a command to display contents of a file. * Typ ...
and
abstract categorial grammar Abstract may refer to: * ''Abstract'' (album), 1962 album by Joe Harriott * Abstract of title a summary of the documents affecting title to parcel of land * Abstract (law), a summary of a legal document * Abstract (summary), in academic publishi ...
.


Some definitions

;Derivation: A derivation is a binary tree that encodes a proof. ;Parse tree: A parse tree displays a derivation, showing the syntactic structure of a sentence. ;Functor and argument: In a right (left) function application, the node of the type A\B (B/A) is called the functor, and the node of the type A is called an argument. ;Functor–argument structure


Refinements of categorial grammar

A variety of changes to categorial grammar have been proposed to improve syntactic coverage. Some of the most common are listed below.


Features and subcategories

Most systems of categorial grammar subdivide categories. The most common way to do this is by tagging them with
features Feature may refer to: Computing * Feature (CAD), could be a hole, pocket, or notch * Feature (computer vision), could be an edge, corner or blob * Feature (software design) is an intentional distinguishing characteristic of a software item ...
, such as
person A person ( : people) is a being that has certain capacities or attributes such as reason, morality, consciousness or self-consciousness, and being a part of a culturally established form of social relations such as kinship, ownership of prope ...
,
gender Gender is the range of characteristics pertaining to femininity and masculinity and differentiating between them. Depending on the context, this may include sex-based social structures (i.e. gender roles) and gender identity. Most cultures ...
,
number A number is a mathematical object used to count, measure, and label. The original examples are the natural numbers 1, 2, 3, 4, and so forth. Numbers can be represented in language with number words. More universally, individual numbers c ...
, and tense. Sometimes only atomic categories are tagged in this way. In Montague grammar, it is traditional to subdivide function categories using a multiple slash convention, so ''A/B'' and ''A//B'' would be two distinct categories of left-applying functions, that took the same arguments but could be distinguished between by other functions taking them as arguments.


Function composition

Rules of function composition are included in many categorial grammars. An example of such a rule would be one that allowed the concatenation of a constituent of type ''A/B'' with one of type ''B/C'' to produce a new constituent of type ''A/C''. The semantics of such a rule would simply involve the composition of the functions involved. Function composition is important in categorial accounts of
conjunction Conjunction may refer to: * Conjunction (grammar), a part of speech * Logical conjunction, a mathematical operator ** Conjunction introduction, a rule of inference of propositional logic * Conjunction (astronomy), in which two astronomical bodies ...
and extraction, especially as they relate to phenomena like
right node raising In linguistics, the term right node raising (RNR) denotes a sharing mechanism that sees the material to the immediate right of parallel structures being in some sense "shared" by those parallel structures, e.g. '' am likesbut red dislikesthe debat ...
. The introduction of function composition into a categorial grammar leads to many kinds of derivational ambiguity that are vacuous in the sense that they do not correspond to semantic ambiguities.


Conjunction

Many categorial grammars include a typical conjunction rule, of the general form ''X CONJ X → X'', where ''X'' is a category. Conjunction can generally be applied to nonstandard constituents resulting from type raising or function composition..


Discontinuity

The grammar is extended to handle linguistic phenomena such as discontinuous idioms, gapping and extraction.


See also

*
Combinatory categorial grammar Combinatory categorial grammar (CCG) is an efficiently parsable, yet linguistically expressive grammar formalism. It has a transparent interface between surface syntax and underlying semantic representation, including predicate–argument structur ...
*
Link grammar Link grammar (LG) is a theory of syntax by Davy Temperley and Daniel Sleator which builds relations between pairs of words, rather than constructing constituents in a phrase structure hierarchy. Link grammar is similar to dependency grammar, but d ...
* Noncommutative logic *
Pregroup Grammar Pregroup grammar (PG) is a grammar formalism intimately related to categorial grammars. Much like categorial grammar (CG), PG is a kind of type logical grammar. Unlike CG, however, PG does not have a distinguished function type. Rather, PG uses in ...
*
Scope Scope or scopes may refer to: People with the surname * Jamie Scope (born 1986), English footballer * John T. Scopes (1900–1970), central figure in the Scopes Trial regarding the teaching of evolution Arts, media, and entertainment * Cinem ...
*
Type shifter Type may refer to: Science and technology Computing * Typing, producing text via a keyboard, typewriter, etc. * Data type, collection of values used for computations. * File type * TYPE (DOS command), a command to display contents of a file. * ...


References

* * * * * * * * * * *


Further reading

* Michael Moortgat, ''Categorial Type Logics'', Chapter 2 in J. van Benthem and A. ter Meulen (eds.) ''Handbook of Logic and Language''. Elsevier, 1997, * Wojciech Buszkowski, ''Mathematical linguistics and proof theory'', Chapter 12 in J. van Benthem and A. ter Meulen (eds.) ''Handbook of Logic and Language''. Elsevier, 1997, * * *


External links


Grammar, categorial
at Springer
Encyclopaedia of Mathematics The ''Encyclopedia of Mathematics'' (also ''EOM'' and formerly ''Encyclopaedia of Mathematics'') is a large reference work in mathematics. Overview The 2002 version contains more than 8,000 entries covering most areas of mathematics at a graduat ...
* http://plato.stanford.edu/entries/typelogical-grammar/ {{Formal semantics Grammar frameworks Formal languages Computational linguistics Type theory Semantics